DeepSeek Launches NSA for Ultra-Fast Long Context Training and Inference

On February 18, 2025, DeepSeek launched NSA (Native Sparse Attention). DeepSeek describes NSA as a hardware-aligned, natively trainable sparse attention mechanism for ultra-fast long-context training and inference. According to the company, its design is optimized for modern hardware, so it speeds up inference and reduces pre-training costs without compromising performance. DeepSeek reports that NSA matches or surpasses full attention models on general benchmarks, long-context tasks, and instruction-based reasoning.

2025-02-18 08:34:43
Disclaimer:
1. The information provided does not constitute investment advice. Investors should make independent decisions and bear all risks themselves.
2. The copyright of this content belongs to the original author. The views expressed herein are solely those of the author and do not represent the stance or position of this website.